23 research outputs found

    Extensible FlexRay communication controller for FPGA-based automotive systems

    Get PDF
    Modern vehicles incorporate an increasing number of distributed compute nodes, resulting in the need for faster and more reliable in-vehicle networks. Time-triggered protocols such as FlexRay have been gaining ground as the standard for high-speed reliable communications in the automotive industry, marking a shift away from the event-triggered medium access used in controller area networks (CANs). These new standards enable the higher levels of determinism and reliability demanded from next-generation safety-critical applications. Advanced applications can benefit from tight coupling of the embedded computing units with the communication interface, thereby providing functionality beyond the FlexRay standard. Such an approach is highly suited to implementation on reconfigurable architectures. This paper describes a field-programmable gate array (FPGA)-based communication controller (CC) that features configurable extensions to provide functionality that is unavailable with standard implementations or off-the-shelf devices. It is implemented and verified on a Xilinx Spartan 6 FPGA, integrated with both a logic-based hardware ECU and a fully fledged processor-based electronic control unit (ECU). Results show that the platform-centric implementation generates a highly efficient core in terms of power, performance, and resource utilization. We demonstrate that the flexible extensions help enable advanced applications that integrate features such as fault tolerance, timeliness, and security, with practical case studies. This tight integration between the controller, computational functions, and flexible extensions on the controller enables enhancements that open the door for exciting applications in future vehicles

    Accelerated artificial neural networks on FPGA for fault detection in automotive systems

    Get PDF
    Modern vehicles are complex distributed systems with critical real-time electronic controls that have progressively replaced their mechanical/hydraulic counterparts, for performance and cost benefits. The harsh and varying vehicular environment can induce multiple errors in the computational/communication path, with temporary or permanent effects, thus demanding the use of fault-tolerant schemes. Constraints in location, weight, and cost prevent the use of physical redundancy for critical systems in many cases, such as within an internal combustion engine. Alternatively, algorithmic techniques like artificial neural networks (ANNs) can be used to detect errors and apply corrective measures in computation. Though adaptability of ANNs presents advantages for fault-detection and fault-tolerance measures for critical sensors, implementation on automotive grade processors may not serve required hard deadlines and accuracy simultaneously. In this work, we present an ANN-based fault-tolerance system based on hybrid FPGAs and evaluate it using a diesel engine case study. We show that the hybrid platform outperforms an optimised software implementation on an automotive grade ARM Cortex M4 processor in terms of latency and power consumption, also providing better consolidation

    Virtualized FPGA accelerators for efficient cloud computing

    Get PDF
    Hardware accelerators implement custom architectures to significantly speed up computations in a wide range of domains. As performance scaling in server-class CPUs slows, we propose the integration of hardware accelerators in the cloud as a way to maintain a positive performance trend. Field programmable gate arrays (FPGAs) represent the ideal way to integrate accelerators in the cloud, since they can be reprogrammed as needs change and allow multiple accelerators to share optimised communication infrastructure. We discuss a framework that integrates reconfigurable accelerators in a standard server with virtualised resource management and communication. We then present a case study that quantifies the efficiency benefits and break-even point for integrating FPGAs in the cloud

    Network enabled partial reconfiguration for distributed FPGA edge acceleration

    Get PDF
    Partial reconfiguration supports virtualisation of applications on FPGAs, enabling compute to dynamically adapt to workloads in distributed infrastructure and datecenters. While the latter often makes use of the PCIe interface and supporting infrastructure to allocate and load compute kernels via a host CPU, FPGAs are becoming increasingly popular as standalone resources in edge-computing, requiring them to manage ac- celerators autonomously. This paper presents a platform that supports the managing of accelerator bitstreams over the network interface on a Xilinx Zynq device without intervention by the Arm processor. We compare against traditional vendor provided PR management for both library accelerators and custom acceler- ators and show that we achieve a 29% decrease in reconfiguration trigger latency using this approach

    A power and time efficient radio architecture for LDACS1 air-to-ground communication

    Get PDF
    L-band Digital Aeronautical Communication System (LDACS) is an emerging standard that aims at enhancing air traffic management by transitioning the traditional analog aeronautical communication systems to the superior and highly efficient digital domain. The standard places stringent requirements on the communication channels to allow them to coexist with critical L-band systems, requiring complex processing and filters in baseband. Approaches based on cognitive radio are also proposed since this allows tremendous increase in communication capacity and spectral efficiency. This requires high computational capability in airborne vehicles that can perform the complex filtering and masking, along with tasks associated with cognitive radio systems like spectrum sensing and baseband adaptation, while consuming very less power. This paper proposes a radio architecture based on new generation FPGAs that offers advanced capabilities like partial reconfiguration. The proposed architecture allows non-concurrent baseband modules to be dynamically loaded only when they are required, resulting in improved energy efficiency, without sacrificing performance. We evaluate the case of non-concurrent spectrum sensing logic and transmission filters on our cognitive radio platform based on Xilinx Zynq, and show that our approach results in 28.3% reduction in DSP utilisation leading to lower energy consumption at run-time

    Fracturable DSP block for multi-context reconfigurable architectures

    Get PDF
    Multi-context architectures like NATURE enable low-power applications to leverage fast context switching for improved energy efficiency and lower area footprint. The NATURE architecture incorporates 16-bit reconfigurable DSP blocks for accelerating arithmetic computations, however, their fixed precision prevents efficient re-use in mixed-width arithmetic circuits. This paper presents an improved DSP block architecture for NATURE, with native support for temporal folding and run-time fracturability. The proposed DSP block can compute multiple sub-width operations in the same clock cycle and can dynamically switch between sub-width and full-width operations in different cycles. The NanoMap tool for mapping circuits onto NATURE is extended to exploit the fracturable multiplier unit incorporated in the DSP block. We demonstrate the efficiency of the proposed dynamically fracturable DSP block by implementing logic-intensive and compute-intensive benchmark applications. Our results illustrate that the fracturable DSP block can achieve a 53.7% reduction in DSP block utilization and a 42.5% reduction in area with a 122.5% reduction in power-delay product without exploiting logic folding. We also observe an average reduction of 6.43% in power-delay product for circuits that utilize NATURE’s temporal folding compared to the existing full precision DSP block in NATURE, leading to highly compact, energy efficient designs

    Quantifying the latency benefits of near-edge and in-network FPGA acceleration

    Get PDF
    Transmitting data to cloud datacenters in distributed IoT applications introduces significant communication latency, but is often the only feasible solution when source nodes are computationally limited. To address latency concerns, cloudlets, in-network computing, and more capable edge nodes are all being explored as a way of moving processing capability towards the edge of the network. Hardware acceleration using Field Programmable Gate Arrays (FPGAs) is also seeing increased interest due to reduced computation latency and improved efficiency. This paper evaluates the the implications of these offloading approaches using a case study neural network based image classification application, quantifying both the computation and communication latency resulting from different platform choices. We consider communication latency including the ingestion of packets for processing on the target platform, showing that this varies significantly with the choice of platform. We demonstrate that emerging in-network accelerator approaches offer much improved and predictable performance as well as better scaling to support multiple data sources

    Build automation and runtime abstraction for partial reconfiguration on Xilinx Zynq UltraScale+

    Get PDF
    Partial reconfiguration (PR) is fundamental to build- ing adaptive systems on modern FPGA SoCs, where hardware can be adapted dynamically at runtime. Vendor supported reconfiguration is performance limited, drivers entail complex memory management, and software/hardware design requires detailed knowledge of the underlying hardware. This paper presents a collection of abstractions that provide high performance reconfiguration of hardware from within the Linux userspace, automating the process of building PR applications, and adding support for the Xilinx Zynq UltraScale+ architecture. We compare our abstractions against vendor tooling for PR management and open source tools supporting PR within Linux. Our tools provides automation and abstraction layers, from defining PR configurations through to compiling and packaging Linux with support for userspace PR control, targeted for non- experts

    VEGa : a high performance vehicular Ethernet gateway on hybrid FPGA

    Get PDF
    Modern vehicles employ a large amount of distributed computation and require the underlying communication scheme to provide high bandwidth and low latency. Existing communication protocols like Controller Area Network (CAN) and FlexRay do not provide the required bandwidth, paving the way for adoption of Ethernet as the next generation network backbone for in-vehicle systems. Ethernet would co-exist with safety-critical communication on legacy networks, providing a scalable platform for evolving vehicular systems. This requires a high-performance network gateway that can simultaneously handle high bandwidth, low latency, and isolation; features that are not achievable with traditional processor based gateway implementations. We present VEGa, a configurable vehicular Ethernet gateway architecture utilising a hybrid FPGA to closely couple software control on a processor with dedicated switching circuit on the reconfigurable fabric. The fabric implements isolated interface ports and an accelerated routing mechanism, which can be controlled and monitored from software. Further, reconfigurability enables the switching behaviour to be altered at run-time under software control, while the configurable architecture allows easy adaptation to different vehicular architectures using high-level parameter settings. We demonstrate the architecture on the Xilinx Zynq platform and evaluate the bandwidth, latency, and isolation using extensive tests in hardware
    corecore